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DOMAIN DECOMPOSITION PRECONDITIONING EOR 
HIGH-EREQUENCY HELMHOLTZ PROBLEMS WITH ABSORPTION 

LG. GRAHAM, E.A. SPENCE, AND E. VAINIKKO 


Abstract. In this paper we give new results on domain decomposition preconditioners for GM- 
RES when computing piecewise-linear finite-element approximations of the Helmholtz equation 
—Au — (k?- -h \e)u = /, with absorption parameter e: € M. Multigrid approximations of this 
equation with g: 7 ^ 0 are commonly used as preconditioners for the pure Helmholtz case (g: = 0). 
However a rigorous theory for such (so-called “shifted Laplace”) preconditioners, either for the 
pure Helmholtz equation, or even the absorptive equation (g: 7 ^ 0), is still missing. We present 
a new theory for the absorptive equation that provides rates of convergence for (left- or right-) 
preconditioned GMRES, via estimates of the norm and field of values of the preconditioned ma¬ 
trix. This theory uses a k- and e-explicit coercivity result for the underlying sesquilinear form 
and shows, for example, that if |e| k^, then classical overlapping additive Schwarz will per¬ 

form optimally for the absorptive problem, provided the subdomain and coarse mesh diameters 
are carefully chosen. Extensive numerical experiments are given that support the theoretical 
results. While the theory applies to a certain weighted variant of GMRES, the experiments for 
both weighted and classical GMRES give comparable results. The theory for the absorptive case 
gives insight into how its domain decomposition approximations perform as preconditioners for 
the pure Helmholtz case g: = 0. At the end of the paper we propose a (scalable) multilevel pre¬ 
conditioner for the pure Helmholtz problem that has an empirical computation time complexity 
of about 0 (n^/^) for solving finite element systems of size n = 0{k^), where we have chosen the 
mesh diameter h k~^^^ to avoid the pollution effect. Experiments on problems with h k~^, 
i.e. a fixed number of grid points per wavelength, are also given. 


1. Introduction 


This paper is concerned with domain-decomposition preconditioning for finite-element discretisa¬ 
tions of the boundary value problem 


( 1 . 1 ) 


r —Au — + ie)M = / in n, 
I du/dn—irju =g on F, 


with k > 0 and 77 = e), where either (i) 17 is a bounded domain in with boundary F or (ii) 17 

is the exterior of a bounded scatterer, F denotes an approximate far field boundary, and the problem 
is appended with a homogeneous Dirichlet condition on the boundary of the scatterer. Although 
the PDE in (1.1) is relevant in applications, our main motivation for studying this problem is its 
recent use in preconditioning the corresponding BVP for the Helmholtz equation: 


( 1 . 2 ) 


J —Au — k^u = f in 17, 
1 du/dn — iku = g on F, 


Linear systems arising from finite element approximations of (1.1) with high wavenumber k are 
notoriously hard to solve. Because the system matrices are non-Hermitian and generally non¬ 
normal, general iterative methods like preconditioned (F) GMRES have to be employed. Analysing 
the convergence of these methods is hard, since an analysis of the spectrum of the system matrix 
alone is not sufficient for any rigorous convergence estimates. 

The idea of preconditioning discretisations of (1.2) with approximate discretisations of (1.1) is 
often called “shifted Laplacian” preconditioning. From its origins in [17], this idea has had a large 
impact on the field of practical fast Helmholtz solvers. The main aim of the present paper is to 
provide theoretical underpinning for this idea, and to nse this theoretical understanding to develop 
new preconditioners for (1.2). 


Key words and phrases. Helmholtz equation, high frequency, absorption, iterative solvers, preconditioning, do¬ 
main decomposition, GMRES. 
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We denote the system matrix arising from continuous piecewise linear (PI) Galerkin finite 
element approximations of (1.1) by (or simply A when e = 0). For the solution of “pure 
Helmholtz” systems Au = f, the “shifted Laplacian” preconditioning strategy (written in left¬ 
preconditioning mode), involves iteratively solving the equivalent problem 

(1.3) B-^Au = 

where B~^ is some readily computable approximation of Aj^ (for example a multigrid V-cycle). 
The rigorous analysis of the performance of this preconditioner is complicated, partly because it is 
based on a double approximation: A~^ m Aj^ r; and partly because the convergence theory 
of GMRES for non-self-adjoint systems requires one to estimate either the field of values of the 
system matrix or the spectrum and its conditioning. 

One natural approach is to write 

(1.4) I-B-^A = I-B-^A^ + B-^A,{I - AJ^A) , 

and to recall that a sufficient (but by no means necessary) condition for GMRES to converge 
quickly is that the field of values of the system matrix should be bounded away from the origin 
and the norm of the system matrix should be bounded above. It is therefore clear from (1.4) that 
sufficient conditions for B~^ to be a good preconditioner for A are: 

(i) Aj^ is a good preconditioner for A 

and 

(ii) B~^ is a good preconditioner for A^. 

Achieving both (i) and (ii) simultaneously imposes contradictory requirements on e. Indeed, it 
is natural to expect that (i) holds if |e| is sufficiently small, but that for (ii) to hold we need |e| 
sufficiently large. Most analyses of the performance of B~^ as a preconditioner for A have focused 
on obtaining conditions under which property (i) holds and have concentrated on analysing spectra. 
While a detailed literature survey is given in [21, §1.1], an up-to-date summary of this is given at 
the end of this section. 

In [21] we gave the first rigorous theory that identified conditions that ensure (i) above holds. 
There, under general conditions on the domain and mesh sequence, we showed that when \e\/k 
was bounded above by a sufficiently small constant then (i) holds. 

The main theoretical purpose of the current paper is to obtain sufficient conditions for (ii) 
to hold in the case when B~^ is chosen as a classical Additive Schwarz preconditioner for A^. 
We use the rigorous convergence theory of [12] (see also [4], [35, §1.3.2]), in which criteria for 
convergence of GMRES are given in terms of an upper bound on the norm of the system matrix 
and a lower bound on the distance of its field of values from the origin. In the Additive Schwarz 
construction, the domain is covered with overlapping subdomains with diameter denoted Hguh and 
also triangulated with a coarse mesh with diameter denoted H. (It is not necessary for Hsnh and H 
to be related.) The overlap parameter is denoted d, and 6 ^ H corresponds to “generous overlap”. 
Further technical requirements are given in §3. 

We highlight at this stage that the conditions on jej that we find for (ii) above to hold do not 
overlap with those described above for (i) to hold, and thus the combination of this paper with [21] 
does not provide a complete theory for preconditioning the Helmholtz equation with absorption. 
Nevertheless 

(a) we believe that the present paper combined with [21] constitute the only rigorous results 
in the literature addressing when either of the properties (i) or (ii) above hold, 

(b) the investigation into the property (ii) in the present paper, combined with the knowledge 
from [21] about the property (i), gives insight into how to design a good preconditioner 
for A (albeit one currently without a rigorous convergence theory); this is especially true 
when considering multilevel methods - see the discussion around Experiments 2 and 3 in 
§ 6 . 

1.1. Summary of main theoretical results. Throughout the paper we assume that 0 <jej ;< 
so the ratio is always bounded above, but may approach zero as fc —> oo. Our main theoretical 

results are Theorems 5.6, and 5.8 and their corollaries, which are proved in §5. Theorem 5.6 
examines the left-preconditioned matrix: B~^A^, and obtains an upper bound on its norm and a 
lower bound on its field of values. The upper bound on the norm is 0{k‘^l\e\)^ while the distance 
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of the field of values from the origin (in the case of generous overlap) has a lower bound of order 
0((|e|/fc^)^). These bounds are obtained subject to the subdomain and coarse mesh diameters 
satisfying bounds: kHgub ^ |e|/^^ and kH < {\e\/k'^)^. 

An important special case is that of maximum absorption jej ^ k^. Then the results imply that 
the number of GMRES iterates will be bounded independently of k, provided i/sub and H both 
decrease with order k~^. Thus, provided there is enough absorption in the system, the GMRES 
method will perform analogously to the Laplacian case, provided the coarse mesh decreases pro¬ 
portional to the wavelength (i.e. no further refinement for pollution is needed). Thus if the mesh 
diameter h of the fine grid decreases as 0{k~^^'^) (needed to remove pollution in the underlying 
discretization), then considerable coarsening can be carried out. We actually see in the numeri¬ 
cal experiments in §6 that further coarsening beyond the k~^ theoretical limit may be possible, 
depending on the choice of e.) Analogous results for right preconditioning (obtained by a duality 
argument) are given in Theorem 5.8. Then Gorollaries 5.7 and 5.9 give the corresponding estimates 
for GMRES convergence for each of these preconditioners. 

1.2. How the theoretical results were obtained. As in classical Schwarz theory, the proofs 
of Theorems 5.6 and 5.8 are obtained from a projection operator analysis (given in §4). However, 
in order to get good results for large k we do not use the classical approach of treating the 
Helmholtz operator as a perturbation of the Laplacian, as was done in [4] (see also [23] , where this 
approach was used for the time-harmonic Maxwell equations). Rather, we exploit the coercivity 
of the problem with absorption (Lemma 2.4), leading to a projection analysis in the wavenumber- 
dependent inner product (u-ji.fc- The norm of the projection operator corresponding to the two- 
level algorithm is estimated above in Theorem 4.3, while the distance of its field of values from 
the origin is estimated below in Theorem 4.17. The analysis depends on a technical estimate on 
the approximation power of the coarse space (Assumption 4.6). We prove this estimate for convex 
polygons (Theorem 4.7), and we also outline how to prove it for more general 2- and 3-d domains 
(Remark 4.9). 

The estimates for the projection operators in §4 are converted to estimates for the norm and 
field of values of preconditioned Helmholtz matrices in §5. Because the analysis is performed in 
the “energy” inner product || • ||i_fc, the corresponding matrix estimates are obtained in the induced 
weighted Euclidean inner product. (A similar situation arises in the classical analysis [5].) We 
performed numerical experiments both for standard GMRES (with residual mininmization in the 
Euclidean norm) and for weighted GMRES (minimizing in the weighted norm), but in practice 
there was little difference in the results. 

1.3. Overview of numerical results. A sequence of numerical experiments is given in §6 for 

solving systems with matrix with h ~ k~^^^ (n ~ k^, where n is the system dimension), yielding 
(empirically) pollution-free finite element solutions. In these experiments H ~ Lfsub- First, we 
consider the performance of the preconditioner B~^ (defined by the classical Additive Schwarz 
method), when applied to problems with coefficient matrix A^. As predicted by the theory, we 
see that is an optimal preconditioner when jej ~ (i.e. the number of GMRES iterates is 

parameter independent), provided the coarse grid diameter H and subdomain diameter iJgub are 
sufficiently small. Experimentally, good results are also obtained even with larger H, iLgub when e 
is large enough, and even with smaller e when iJ, iJgub are small enough. We also test variants of 
the classical method, including Restricted Additive Schwarz (RAS) and the Hybrid variant of this 
(HRAS) (where coarse and local parts of the preconditioner are combined multiplicatively). Out 
of all the methods tested, HRAS performs the best. 

Based on this empirical insight gained about preconditioning A^, we then investigate the per¬ 
formance of HRAS (with absorption e) as a preconditioner for the pure Helmholtz problem with 
coefficient matrix A. We find that HRAS still works well, provided H and iLsub are small enough. 
There is surprisingly little variation in the performance with respect to the choice of e. (In fact 
with jej = k^, the performance is almost uniform in the range /3 € [0,1.2] but there is some degra¬ 
dation as P approaches 2. This is surprising as the choice /3 ^ 2 is normally used in the multigrid 
context. We also test a variant of HRAS that uses impedance conditions on subdomain solves and 
this works well, especially for larger H, iLgub- 

Finally, to solve problems with matrix A in the case of large fc, we recommend an inner-outer 
preconditioner for use within FGMRES, where the outer solver is HRAS with jej = k and H ~ 
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Hsuh ~ k~^. The cost of the preconditioner is then dominated by the coarse grid problem, and 
for this we apply an inner iteration with preconditioner chosen as one-level HRAS with impedance 
boundary condition on local problems. With the best choice of e appearing to be jej ~ k, we 
find this solver has a compute time of about O(fc^) ^ for the 2D problems tested, up 

to fc = 100. This is a highly scalable preconditioner, whose action consists of inverting 0{k‘^) 
(parallel) finite-element systems of size 0{k) and an additional 0{k) finite-element systems of size 
0{k). Additional experiments, together with multilevel variations suitable for the case h ^ k~^ 
are given in [26]. 

1.4. Literature review. We finish this section with a short literature survey on this topic, be¬ 
ginning with the literature on preconditioning with absorption, and then briefly discussing domain 
decomposition methods for wave problems. 

The survey in [21, §1.1] focused on the spectral analyses in [17], [16], [43], [15, §5.1.2], [18], all of 
which concern the optimal choice of e for to be a good preconditioner for A, i.e. for property (i) 
above to hold. Several authors have considered the question of when multigrid methods converge 
when applied to the problem with absorption (i.e. A^); this is related to (but not the same as) 
the question of when property (ii) above holds. Cools and Vanroose [9] computed the “minimal 
shift” (defined as the smallest value of e for which every single eigenmode of the error is reduced 
through consecutive multigrid iterations) based on numerical evaluation of quantities arising from 
Fourier analysis, and found that (as a function of fc) it is proportional to fc^. Cocquet and Gander 
[8] (following on from [18]) showed that, for a particular standard variant of multigrid applied 
to the 1-d Helmholtz equation with Dirichlet boundary conditions, one needs jej ~ fc^ to obtain 
convergence independent of fc. They also analysed a less-standard variant of multigrid applied 
to general multi-dimensional Helmholtz problems with either Dirichlet or impedance boundary 
conditions, and showed that again one needs jej ^ fc^ for the method to be practical. Note that 
these analyses are concerned with the convergence of multigrid as a solver for A^, rather than using 
an approximation such as the V-cycle applied to the problem with absorption as a preconditioner 
for either the absorbing or the non-absorbing problem (A or A^ respectively). 

The study of non-overlapping domain-decomposition methods for wave problems has a long 
history, starting with the seminal paper of Benamou and Despres [2]. Following that, optimized 
interface conditions were introduced [22], the success of which sparked substantial interest, for 
example [20], [11], and more recently the “source transfer” and related methods [7], [6], and [40]; 
these latter methods can be viewed as putting the “sweeping” method of [14] in a continuous (as 
opposed to discrete) setting. All these non-overlapping domain decomposition methods focus on 
the choice of good interface conditions but so far do not provide a systematic method of combining 
these with coarse grid operators or a convergence analysis explicit in subdomain or coarse grid 
size. There are also a few results on overlapping domain decomposition methods e.g. [41], [30], 
[31], with the latter explicitly using absorption; these demonstrated the potential of the methods 
analysed in this paper. Finally, we note that [44] introduces a new sweeping-style method for the 
Helmholtz equation, and also contains a good literature review of both domain-decomposition and 
sweeping-style methods. 


2. Variational formulation 

For ease of exposition, we restrict attention to the interior impedance problem (i.e. (1.2) is 
posed for H a bounded domain in with boundary F). The results of the paper also hold for 
the truncated sound-soft scattering problem, and we outline in Remark 5.10 how to adapt them 
to this case. 

Let H be a bounded, open, polygonal (Lipschitz polyhedral) domain in d = 2 (or 3), with 
boundary F. We introduce the standard fc-weighted inner product and norm on 

(u,n;)i,fc = (Vw, Vn;)L 2 (Q)-p fc2(i;,'u;)i2(n) and \\v\\i^k = iv,v)yj . 

The standard variational formulation of (1.1) is: Given / G g G L^{T), e € K. and fc > 0 

find u G i7^(H) such that 


(2.1) 


ae{u,v) = F{v) for all v G i7^(H) 
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where 


( 2 . 2 ) 

and 

(2.3) 


ae{u^v) := / Vu ■ Vv — {k^ie) / uv — ir] uv^ 

Jo, Jq Jr 


F{v) := [ fv+ [ gv. 
Jn Jr 


In general g can be complex, with a natural choice being a square root of + ie. (more details 
are in Lemma 2.4). When e = 0 and 77 = fc we are solving (1.2) and we simply write a instead of 

We consider the discretisation of problem (2.1) with PI finite elements. Let be a family of 
conforming meshes (triangles in 2D, tetrahedra in 3D), that are shape-regular as the mesh diameter 
h —>■ 0. A typical element of is r G (a closed subset of D). Then our approximation space 
is the space of all continuous functions on D that are piecewise affine with respect to T^. (The 
impedance boundary condition in (1.2) is implemented as a natural boundary condition.) The 
freedoms for are the nodes, denoted = {xj : j G I^}, where is a suitable index set. The 
standard basis for is {(j)j : j G consisting of hat functions corresponding to the each of the 
nodes in A/”^. 

The Galerkin approximation of (2.1) in the space V'^ is equivalent to the system 


(2.4) 


AgU := {S — + \e)M — igN)^! = f. 


where 

(2.5) Se^rn= / Ml^rn = / 4>e4>m, Nl^rn = / m G 

Jq j q, Jr 

are, respectively, the stiffness matrix, the domain mass matrix, and the boundary mass matrix. 
Again we write the corresponding system matrix for (1.2) simply as A. Note that A and Ag are 
symmetric but not Hermitian. 

In this section we briefly provide the key properties of the sesquilinear form given in (2.2). 
This form depends on all three parameters e, k and g, but only the first of these is reflected in the 
notation. Normally g will be chosen as a function of e and k. We will assume throughout that 

( 2 . 6 ) |e| < and |? 7 | < k. 

(Here the notation A < B (equivalently B > A) means that A/B is bounded above by a constant 
independent of k, e, and mesh diameters h,Hsnh,H (the latter two introduced below). We write 
A ^ B when A < B an B < A. 

The proof of the first result is a simple application of the Cauchy-Schwarz and multiplicative 
trace inequalities - see, e.g., [21, Lemma 3.1(i)]. 


Lemma 2.1 (Continuity). If \g\ < k then, given fco > 0; there exists a Cc independent of k and e 
such that 


(2.7) |ae(7;,u;)| < C^\\v\\^ ^, ||w;||i 

for all k > ko and v,w € 


We now give a result about the coercivity of a^, which is a generalisation of [21, Lemma 3.1(ii)]. 
To state this we need to define 'fk'^ + ie, taking care to cater for both positive and negative e. We 
need to consider both positive and negative e since, whichever choice we make for the problem ( 1 . 1 ), 
the other forms the adjoint problem, and we need estimates on the solutions and sesquilinear forms 
for both problems (in particular, this is essential for analysing both left- and right-preconditioning). 

Definition 2.2. z(k,£) := \/k'^ \£ is defined via the square root with the branch cut on the 

positive real axis. Note that this definition implies that, when £ 0, 

( 2 . 8 ) 3(z) > 0 , sign(£) 5 i( 2 ;) > 0 , and z{k,—£) = —z{k, £). 


Proposition 2.3. With z{k,£) defined above, for all k > 0, 

I 2 I ^ fc2- 


(2.9) 


k and 
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Proof. Writing z = p + iq, we see that the definition of 2 : implies that 

P = if e > 0, p = — if e < 0, and q = ^/^ for all e ^ 0, 

where 


( 2 . 10 ) 


2 + k'^ 

P = - 


and q = 


2 'fk* + — kf 


2 ’ ^ 2 
(and denotes the positive real square root). Using (2.6) we therefore see that \p\ ^ k. Further¬ 
more, the definition of 2 implies that 2pq = e, and thus q = \q\ ^ kl/bl ~ |e|/^- Using (2.6) again, 
the estimates (2.9) follow. □ 

Lemma 2.4 (Coercivity). Let 2 = z(k,e) be as defined in Definition 2.2, and choose ry in (2.2) to 
satisfy the inequality 

(2.11) iR{zr]) > 0. 

Then there is a constant p > 0 independent of k and e such that 

(2.12) |ae(u,u)| > 3(0a£(u,u)) > 

for all k > 0 and v € H^{Q), where 0 = — 2 /| 2 |. 

Proof. Writing z = p + iq and using the definition of , we have 

ae{v,v) = ||Vu||i2(n) - (p-Hig)^||u||i2(n) - i??lklli2(r) ■ 

Therefore 

^[-{p-iq)ae{v,v)] = g|| Vu||^ 2 (n)-h g(p^-h g^)||u||^ 2 (f 2 ) + 5R [(p - ig)??] Iklli 2 (r) ■ 

Hence, dividing through by \z\ = ^Jp^ + and setting 0 = — 2 /I 2 I, we have 


3[0ae(u,i;)] = 


llVull 


L 2 (a) 


i^nni 


A 2 (n) 


I I rllL2(r) ■ 


The result then follows from condition (2.11) and the second estimate in (2.9). 


□ 


Remark 2.5 (Choices of r\ satisfying (2.11)). An obvious choice of rj that satisfies the eoercivity 
condition (2.11) is rj = z, for then 3 ^( 277 ) = 5ft(22) = \z\‘^ > 0, Another possible choice is rj = 
sign(e)fc, for then, by (2.8), we have IRfzr]) = sign(£:)5R(2)fc > 0. Note that both these choiees 
satisfy the condition on |? 7 | in (2.6). 

The fact that the choice of rj for coercivity to hold depends on the sign of e is expeeted, since the 
sign of e also dictates the properties of rj required for the problem (1.1) to be well posed. Indeed, 
repeating the usual argument involving Green’s identity (given for e = 0 in, e.g., [39, Theorem 6.5],) 
we see that if s > 0 we need > 0 for uniqueness and if e < 0 we need 5ft(7y) < 0. 

The eondition for coercivity (2.11) is more restrictive that the conditions for uniqueness. Indeed, 
since ^( 277 ) = 5R(2)5ft(77) -I- 3 ( 2 ) 3 ( 77 ), when e > 0, a suffieient condition to ensure (2.11) is ^( 77 ) > 
0, 3 ( 77 ) > 0. Similarly, when e < 0 a sufficient condition for (2.11) is ^( 77 ) < 0, 3 ( 77 ) > 0. 

This lemma immediately also gives us a result about the coercivity of the adjoint of a^, given 
by 

a*{u,v) = / Vu • Vu — (fc^ — ie) / uv + ir] uv. 

«/ o o r 

Corollary 2.6. Under assumption (2.11) we also have eoercivity of the adjoint form: 

(2.13) \a*e{v,v)\ > 3(00^(77,7;)) > p^-^\\v\\lk 

for all k > 0 and v G 11^(1}), where 0 = — 2 /| 2 |. 

Proof. Note that the adjoint form is simply a copy of the original form Og, but with parameters e 
and 77 replaced by e = —e and rj = —rj, and thus (by (2.8)) 2 = — 2 . The condition for coercivity 
of the adjoint form is then ^(zrj) > 0 , which is equivalent to condition ( 2 . 11 ). □ 


Remark 2.7. Throughout the paper we will always assume that e and rj are chosen so that con¬ 
ditions (2.6) and (2.11) hold, and so the forms Og and a* always will satisfy the continuity and 
coercivity estimates (2.7), (2.12) and (2.13). 
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3. Domain Decomposition 

To define appropriate subspaces of we start with a collection of open subsets {ilf : I = 
1,..., N} of that form an overlapping cover of fl, and we set fl Each is assumed 

to be non-empty and fig is assumed to consist of a union of elements of the mesh Th- Then, for 
each i = 1,..., N, we set 

Vi = {vh G : supp{vh) C 

Note that, since functions in are continuous, functions in Vi must vanish on the internal 
boundary dQi\T, but are unconstrained on the external boundary dVli fl E. The freedoms for Vi 
are denoted = {xj : j G where X^iVli) is a suitable index set. The basis for 

V^{Vli) can then be written {tpj : j G {Vli)} . 

Thus a solve of the Helmholtz problem (2.1) in the space Vi involves a Dirichlet boundary 
condition at internal boundaries and natural boundary condition at external boundaries (if any). 
The introduction of the absorption e 0 ensures such solves are always well-defined. Future work 
will consider the analysis of methods with other local boundary conditions (such as impedance or 
PML). Internal impedance conditions are considered in the experiments in §6. 

For J G Z^(f2i) and j' G we define the restriction matrix {Ri)j,j' ■= Sjjr. The matrix 
:= RiA^Rj is then just the minor of A^ corresponding to rows and columns taken from 
X^{VLi). One-level domain decomposition methods are constructed from the inverses A~j. More 
precisely, 

(3-1) Be~,AS,local ■= 

i 

is the classical one-level preconditioner for Hg with the subscript ‘‘‘'local” indicating that the solves 
are on local subdomains Hf. 

For the theory, we need assumptions on the shape of the subdomains and the size of the overlap, 
and we require any point in Q. to belong to a bounded number of overlapping subdomains. First, 
for simplicity we assume the subdomains are shape-regular Lipschitz polyhedra (polygons in 2D) 
of diameter Hi = diam(H^), with the volume of order ~ Hf and surface area ~ Hi~^ respectively. 
The coarse mesh diameter iEsub := max{iE^ N} is then a parameter in our estimates. 

Each fl; is required to have a large enough interior boundary, i.e. we require that 

(3.2) |afl; \ r| - for each 1. 

Concerning the overlap, for each i = 1,..., let denote the part of that is not overlapped 
by any other subdomains, and for ^ > 0 let denote the set of points in that are a distance 

no more than ^ from the boundary dVli. Then we assume that for some 5 > 0 and some 0 < c < 1 
fixed, 

(3.3) fli^cS C riiX^li C ni,s- 

Put more simply, the overlap is assumed to be uniformly of order S; the case S H is called 
“generous overlap”. Finally, we make the finite overlap assumption 

(3.4) #A(^) < 1, where A(£) = {£' : VLi n VLt ^ 0} . 

Two-level methods are obtained by adding a global coarse solve. Let {T^} be a sequence of 
shape-regular, simplicial meshes on O, with mesh diameter H. We assume that each element of 
consists of the union of a set of fine grid elements. The set of coarse mesh nodes is denoted by 
X^. The coarse space basis functions <l>p are taken to be the continuous PI hat functions on . 

From these functions we define the coarse space Vo := span{<i>p : p G X^} , which is a subspace 
of V^. Now, if we introduce the restriction matrix 

(3.5) (Po)pj := . J e X^ PG I", 

then the matrix 

(3.6) Ag^Q := RqA^R^ 

is the stiffness matrix for problem (1.2) discretised in Vo using the basis {ilip : p G X^}. Note that, 
due to the coercivity result Lemma 2.4, both A^^ and A^^i are invertible for all mesh sizes h and 
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all choices of e ^ 0. This is easily seen, since, for example, if = 0, where v is a vector defined 
on the freedoms then 0 = v*>l£_ov = ae{vH, vh), where vh = Vp^p and so 

0 = \aeivH,VH)\ > 

which immediately implies vh = 0, and thus v = 0. Similar arguments apply to and to the 
adjoints Al* £ = 0,..., iV. 

The classical Additive Schwarz method is 

(3-7) ^e.AS •= + ^e,AS,locals 

(i.e. the sum of coarse solve and local solves) with B~\g defined in (3.1). 


4. Theory of Additive Schwarz Methods 


The following theory establishes rigorously the powerful properties of the preconditioner (3.7) 
applied to if jej is sufficiently large and i7sub, H are sufficiently small. 

This theory was inspired by reading again the results in [4] where non-self-adjoint problems 
that were “close to” self-adjoint coercive problems were considered. Although our problem here is 
not close to a self-adjoint coercive one, and our technical tools are very different, [4] provided a 
framework that we were able to adapt into the following results. 

The first lemma is an extension of the familiar “stable splitting” property of domain decompo¬ 
sition spaces. This is well-known for the norm (see, e.g., [42]) but here we extend it to the case 
of the /c-weighted energy norm. 

Lemma 4.1. For all Vh , there exist vi G Vi for each ^ = 0, • • • ,N such that 

N N 

(4-1) Vh = '^Vi and '^\\ve\\lk ^ 

e=o £=o 



Proof. This is adapted from the proof of analogous results for Laplace problems; see, e.g., [42]. 
The proof starts by approximating Vh by the quasiinterpolant from the coarse space: 

To := 

pel" 


where 



and Wp = supp(<i>^) . 


Then, using the shape regularity of it is straightforward to show that 

(4-2) ||To||L2(n) < ||T/t||L2(n). 

Next, we take a partition of unity {xi '■ i = 1, ■ ■ ■, N} subordinate to the covering and set 
(4.3) ve = - Vo)) , 

where denotes nodal interpolation onto V^. The first equality in (4.1) follows easily after 
summation. Moreover the estimate 

/ tJ\ 

(4-4) ^ f 1 + j 


is familiar from results on self-adjoint coercive problems; see, e.g., [27, Theorem 3.8]. 

To obtain the second inequality in (4.1), we note that by definition of /^, we have, for any 
r € with T C and any x G t, we have 


\ve{x)\ 


< 


{Xl{vh-Vo)){x^^)(fi:{x) 


1/2 




/ t \|2 


^ l(T/,-To)(a;j)l 


- X! - Vo){x'])\ 




DOMAIN DECOMPOSITION FOR HIGH-FREQUENCY HELMHOLTZ 


9 


where {xj : j G I^(r)} denotes the nodes on r. Hence 

T(Z^l rCL^i 

(4-5) < V\\T\~^\\vh-vo\\l2(^) = \\vh - vo\\l2(^^). 

tcTu 

Thus, because of the finite overlap property (3.4), we have 

N 

(4-6) ^ lk^llL2(0) < Ikli ~'*^o|lL2(n) ^ lk/t|lL2(Q) + ||lIo||L2(f2) . 

Combination of this with (4.2) yields J2^=o Il^^lli 2 (n) Ik?i|li 2 (f 2 ) . Then multiplication by and 
combination with (4.4) gives the required result. □ 


The next lemma is a kind of converse to Lemma 4.1. Here the energy of a sum of components 
is estimated above by the sum of the energies. 

Lemma 4.2. For all choices of vg (zVg , £ = 0, • • • ,N, we have 

2 


(4.7) 


N 


N 

< ■ 

Lfe i=0 


Proof. Let 'Y^ denote the sum from £ = 1 to iV and and recall the notation A(£) introduced in 
i 

(3.4). Then, using several applications of the Cauchy-Schwarz inequality. 


E 


ve 


l,k 


^ e e' /i.fc 


= E E 

£ I'GAil) 


< 


EiNiii.M E 

y'GA(^) 


(4.8) 


< Eii^^IIm 




1/2 


1/2 


E f E 

e \i'eA{e) 

E#a(£) E 

' £'eA(£) 


1/2 


1/2 


< 


El 




where we used the finite overlap assumption (3.4). To obtain (4.7), we write 
^ 2 

(4.9) 


N 

E^^ 

£^0 


l.k 


\£=0 i =0 ) 1 r 


= Ikollufc + 2 Lo,E^^) + (E^^’E 

V £/ l.k \ e e 


Vi 


l.k 


Using the Cauchy-Schwarz and the arithmetic-geometric mean inequalities on the middle term we 
can estimate (4.9) from above in the form 


< 


2 

1,A; ' 


E 


VI 


l.k 


and the result follows from (4.8). 


□ 
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Now for each i = 0,N, we define linear operators Qe,i ■ H^{n) —>• Vi as follows. For each 
Vh € Qejv is defined to be the unique solution of the equation 

(4.10) ae(Qe,ev,Wh,i) = ae(v,Wh,e), Wh,e&V£. 

and we then define 

N 

Qe ~ ^ ( Qe,E • 
e=o 

The matrix representation of Qg corresponds to the action of the preconditioner (3.7) on the matrix 
(this will be shown in Theorem 5.4 below). In Theorems 4.3 and 4.17 below we estimate the 
norm and field of values of Qg, and this yields corresponding estimates for the norm and field of 
values of the preconditioned matrix in Theorems 5.6. Such projection analysis is commonplace in 
domain decomposition; however, as far as we are aware, this is the first place where the projection 
operators are defined using the Og sesquilinear form and analysed in the wavenumber-dependent 
II • ||i,fc energy norm. 

Theorem 4.3. (Upper hound on Qg) 

\\QeVh\\i,k < ||i^/i||i,fe for all Vh&V^. 


Proof. By the definition of Qe and Lemma 4.2, we have 

2 


(4.11) 




N 




iVh 


(=0 


N 


i,fc ^=0 


Furthermore, by applying Lemma 2.4 and the definition (4.10), we have 

^\\Qe,eVh\\l^k ^ (rT ) = (-TT ) 3 ( 0L]ae(-i;/,,(5g,^U/i) 

£=o Vl^l/ (i^Q VpI/ \ 


N 


= UT P 


iVh 


e=o 




N 


f ^ ^ Qe,i'^h 
\ i^o / 


(recalling that |0| = 1). Then, using Lemma 2.1, and then Lemma 4.2, we have 


N 




i=0 


(4.12) 


< 


< 








N 


E< 3 .. 


iVh 


e=o 

' N 


l,k 


1/2 


\\vh\\i,k 


e=o 


The result follows on combining (4.11) with (4.12). 
Remark 4.4. The use of the estimate 


□ 


3(0ag(u,u)) > yi|i;||?,fe, 

which follows from (2.12), is crucial in the proof of Theorem f.S. Indeed, the above proof uses the 
linearity of the imaginary part of a{-, •) with respect to the seeond argument. The eruder estimate 

k(^,^^)l > ^IkliP 

which also follows from (2.12), could not be used to prove Theorem f.3. 

Lemma 4.5. 

+ ^ Ik/ill i,fc forallvheV^. 


1/2 
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Proof. We first recall the decomposition of Vh as given in Lemma 4.1. Then, using Lemma 2.4, the 
definition of Qe,e and Lemma 2.1, we obtain: 


l£l 


N 


l,k 


< Sj 


[Qae{vh,Vh)] = [0ae(nh,-c;)] 


1=0 


N 


N 


1^0 1^0 

Then applying the Cauchy-Schwarz inequality and Lemma 4.1 yields 


N 


\ 1/2 . ^ \ 1/2 


\£=0 


\e=o 


1 


N 


1/2 




\e=o 


□ 


Our next key result (Lemma 4.10 below) is an estimate for the L^-error in the coarse space 
projection operator Qe,o', this is crucially needed to get good estimates for the two-grid precondi¬ 
tioner represented by In order to prove this result we need to make an assumption about the 
approximability on the coarse grid of the solution of the adjoint problem. 

Assumption 4.6 (Coarse-grid approximability of the adjoint problem). If <f) is the solution of the 
adjoint problem 


(4.13a) 

(4.13b) 

with f € then 

(4.13c) 


-A(j) - (fc^ - ie)^ = / 


n, 


df) p 

—-1770 = 0 on L 

on 


u-Mi,k < o ii/iil=(o)- 


«?^o^Vo 


Theorem 4.7. Assumption j.6 holds when Q is a 2-d convex polygon, p satisfies (2.11), and the 
coarse grid is as described in §5 (with, in particular, H denoting the mesh diameter). 

Proof. If (j) satisfies (4.13) and rj satisfies (2.11), then the coercivity estimate (2.13) combined with 
the Lax-Milgram theorem implies that 

k „ 


(4.14) 


ll'/'ll 


i.fe 


< 




If n is a convex polygon, the regularity results in [28] can then be used to show that 


(4.15) 




H2(0) 


< 


lz,2(0) 


see [21, Lemma 2.12]. Now, with denoting the Scott-Zhang quasi-interpolant on the coarse grid, 
we have 


(4.16) 


H2(o) +kHm 

m{Q.) 


[37, Theorem 4.1], and the result (4.13c) follows from combining (4.14), (4.15), and (4.16). □ 

Remark 4.8 (Bounds on the adjoint problem), (i) In the proof of Theorem j.I, we obtained the 
bound (4.14) from coercivity and the Lax-Milgram theorem. This bound can also be obtained from 
an argument involving Green’s identity (with the latter giving better estimates in the case of an 
inhomogeneous boundary condition); see [21, Remark 2.5] (but note that the p in (2.3b) of that 
paper should berj). 

(ii) The bounds (4.14) and (4.15) are the best currently-available bounds on the solution of 
(4.13) for e P k, but they are not optimal when e k - see [21, Theorem 2.9]. 
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Remark 4.9 (Establishing Assumption 4.6 for more general domains), (i) -regularity of the 
Laplacian on convex polyhedra with homogeneous Dirichlet boundary conditions is proved in [10, 
Corollary 18.18]. The analogous result for inhomogeneous Neumann boundary conditions could 
then be used, following [21, Lemma 2.12], to prove that (4.15) (and thus also Assumption 4-(>) 
holds for the solution of (4.13) on convex polyhedra with quasi-uniform meshes. 

(ii) When il is a bounded, non-convex Lipschitz polyhedron in R'*, d = 2,3, it is natural to use 
a sequence of locally-refined meshes. In this case we expect Assumption 4^.6 to hold where H is 
replaced by (l/iV)^/'^, where N is the dimension of the subspace (so (1/iV)^/'^ is the largest element 
diameter). The steps to prove this are outlined in [21, Assumption 3.7, Remark 3.8]. 

We now use Assumption 4.6 to prove the key lemma on the approximation power of Qe,o 
measured in the L^-norm on the domain. 

Lemma 4.10. (Estimate for Qe,o) For all v € Il^(fl), 

(4-17) [](/ - (5e,o)^^||L2(n) < kH \\i^ ~ <3e,o)^^||i,fe • 

Proof. In the proof, for simplicity, we write Qo instead of Qe,o- Recall that Qo is defined by the 
variational problem ae{Qov,w) = ae{v,w), for all w G Vo, and thus cq := (/ — Qo)v satifies 

(4.18) ae(eo,w) = 0 for all u; G Vq, 

Let (f) be the solution of the adjoint problem 

— A(/> — (fc^ — ie ) 4 > = Co on Vt , 
dtp _ 

——h irjtp = 0 on L. 
on 

Then, for all w G we have ae{w, tp) = (w, eo)L 2 (f 2 ). Hence, using (4.18), we can write 

(4-19) l|eo|li 2 (n) = \a,{eo,(p)\ = \a,{eo,tP - Po)\ 

for any po G Vo- Now, by Assumption 4.6, there exists a po gVq such that 

!!</>- 11 / 1 ^ 2 ( 0 ) • 

Therefore, using this last bound and continuity, we have 

(4.20) |ae(eo,<(>-((>o)| < ||eo||i.fc||<(> - ((-ollEfc < |leo|[i,fe(fci7) l|eo|lL 2 (n), 

and combining (4.20) and (4.19) we obtain (4.17). □ 

In what follows, we need both the Poincare-Friedrichs inequality and the trace inequality on 
domains D of characteristic length scale L. By this we mean that D is assumed to have diameter 
^ L, surface area ^ L‘^~^ and volume ^ L'^. The estimates in the next two results are then explicit 
in L (with the hidden constants independent of L). 

Theorem 4.11. If D is a Lipschitz domain with characteristic length scale L, then the Poincare- 
Friedrichs inequality is 

(4-21) lkllL2(D) < L|t>|i/i(£)), 

for all V G H^{D) that vanish on a subset of dD with measure ^ L'^~^, and the multiplicative trace 
inequality is 

(4.22) |k||i 2 (aB) < (L-i||u||i 2 (^) + |u|^^i(B)) |]u|]i 2 (^) , for all v G H^D) . 

Proof. For domains of size 0{1), (4.21) is proved in, e.g., [34, Theorem 1.9], and (4.22) is proved 
in [28, Last equation on p. 41]. A scaling argument then yields (4.21) and (4.22). □ 

Combining (4.21) and (4.22) we obtain the following corollary. 

Corollary 4.12. If D is a Lipschitz domain with characteristic length scale L and v vanishes on 
a subset of dD of measure ^ L‘^~^, then 

(4-23) \\v\\L'^{aD) PP . 
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At various places in this paper we make use of the simple “Cauchy inequality”: 

52 

(4.24) 2ab<Sa^ + —, a,b,6>0. 


In particular, using this (with <5=1) and the multiplicative trace inequality (4.22), we obtain 
another corollary to Theorem 4.11. 


Corollary 4.13. If D is a Lipschitz domain (with characteristic length scale 0(1)J then 

(4.25) < Iklli fe j for all v € H^{D) and k>l. 

Our goal for the rest of the section is to bound the field of values {v}nQeVh)i,kl\\vh\\\ k away 
from the origin in the complex plane. (Note that the field of values is computed with respect to 
the (^Oi.fc inner product.) We do this by estimating \{vh,QeVh)i,k\ below by J2i=o\\Qe,i'^h\\l^k 
plus “remainder” terms (which turn out to be higher order, i.e. bounded by a positive power of 
H or i?sub), and then use Lemma 4.5 to bound the sum below by Lemma 4.14 sets up 

the “remainder” terms, Re^efvh), Lemmas 4.15 and 4.16 estimate these, and the final result is then 
given in Theorem 4.17. 


Lemma 4.14. For £ = 0,..., A^, set 

(4.26) ReA'^h) ■= {{I - Qe.iAh.Qe.evAi^k- 
Then 

N 

(4.27) {vh,QeVh)l,k = '^{\\Qe,iVh\\'i,k + ReAlJh)} ■ 

e=o 

Furthermore, R^^t satisfies 

(4.28) \ReAi’h)\ < DeA'Vh) + BsAvh), 

where the “domain” and “boundary” contributions to the bound are given by 

(4.29) D^A'^h) = A\\{I - Qi.A'^h\\L'^{p.i)\\Qs/'^h\\L‘^{ni): 

(4-30) BsA'^Jh) = k\\{I - Qe,tAh\\L-^{Ti)\\Qe,tVh\\L'^(Ti)- 

and flo = n, To = T, and T^ = L fl dn^, for £ = 1,..., N. 

Proof. By the definition of Qg, 

N N 

Ah} Qe'^h)l.k — ^ ^ Ah}Qe,l'^A\^k ~ ^ ^ 4” ((-^ Qe,i)'^h} 5 

e=o e=o 

yielding (4.27). To obtain (4.28), we recall that. 

{u,v)i^k = ae{u,v) + (flA + ie)(u, v)i2(Q) + 177 ( 11 , ?;)i2(r). 

Then, since 

ae((7 Qe.lAh} Qe,i‘^h^ — 9 
(from the definition of Qe^i (4.10)), we have 

Be.flAh) = {2k +ic)((/ Q€,A^h } Q € L'^ F A{{^ Qe^A^h 7 Q e (T 7 

where we have also used the fact that Qe,iVh has support only on fl;. The desired “domain” and 
“boundary” estimates (4.29) and (4.30) then follow after using (2.6). □ 

We now bound DgAvh) and B^^A'^h)} using the following strategy. First Lemma 4.15 bounds 
De,o{vh) in terms of a positive power of H, which is obtained by using Lemma 4.10 to estimate 
the 11(7— Qe,o)n/t||L 2 (f 2 ) component of D^A'^h)- Then, in Lemma 4.16 we bound DeA'^^h) in 
terms of a positive power of iLsub, by applying the Poincare-Friedrichs inequality (4.21) to each 
of the \\Qe,iVh\\L^{ne) terms in this sum. These two lemmas also provide bounds on B^Avh) and 
B^Avh)} respectively, where similar ideas are used, except this time in conjunction with trace 
inequalities. Recalling that H is the coarse mesh diameter and iLsub the subdomain diameter, we 
are then able to control the error terms by making H and i7sub sufficiently small (Theorem 4.17); 
it turns out that the required condition on H is more stringent than that on iLsub- 
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Lemma 4.15 (Bounds on and B^^). For any a,a' >Q and any Vh G 


(4.31) 


DeA'^h) - ( Ta 


yL2\“ /jL2\-“ 

Kill 


(4.32) B,Avh) < {kHA^ 


o /P 
||Qs,0^.|llfc- 


\Ml, 


and thus (taking a' = a) 

De,o{vh) + Bsfiivh) 


(4.33) < (fcil)i/2 


2\ 1/2 


l + (fciJ)l/2 




2\ 1/2' 




2 \ 


lk/.ll 


i,fe 


Proof. For the bound on Di:fi(vh), we use (4.17), the triangle inequality, and the Cauchy inequality 
(4.24) to obtain 


DeAvh) < fc77 ( ^ ) ||(/-Qe,o)ii?i|li,fc||Q£,oii?i|li,fc, 


(4.34) 

(4.35) 




< kH 


||Qe,0ll?i|li + IIQe.O'I'/iIIi llll^illl.fe 


\\Qe,0VH\\l,+ {^^ IIK.on,||t,+ (y) WvnWl, 


for any a > 0. Since |e| < fc^, (4.31) follows. 

For the bound on B^Avh): we apply the multiplicative trace inequality (4.22) on fl (so L ^ 1), 
to obtain 

Bs,o{vh) ^ k 11(7 — Qe, 0 ) 11/1 llli(Q) ||(d — Qe,o)i1/i||Ii(q) ||Qe,0ll/i|li2('p) ■ 

Using (4.17), we then have 

/ , \ 1/2 

BeAvh) <k{kHA^ (^-j \\{I - QeAvh\\\(k \\{I - QeAvh\\]i?(^a) IIK.0ll/.||p2(r) 

and then using (4.25) and the triangle inequality we obtain 


BeAvh) < {kHA^ ( - 


||(d Qe,oAh\\i^k IIQE,Oll/i|li_fc 


2\ 1/2 , 


||Qe,Oll/i|li_fc + l|ll/i|ll,fc IIQE,Oll/i|li^fe 


' |e| 

This last inequality is the analogue of (4.34), and proceeding as before we obtain (4.32). 
Lemma 4.16 (Bounds on ^ He,^, X) ^e,e)- o:,a' > 0 and any Vh £ , 


□ 


N 


(4.36) 
and 

(4.37) 


^ ^ a kPfsub 


e=i 


N 


/ p2\ “ IV / jL2\ 

X] IIK,/ii/illi,fe + Ikiilli.i 


^ ^ Bc^i{vh) A kHgy^}) 


e=i 


1 ' N 


EIIK.^14.IIi„ 


£=1 


Ikl^lll,. 


Therefore (letting a' = a), 

N 


N / 1 . 2 \“ 1 V / 1,2 \ 

(4.38) + < kH,^, 
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Proof. Let £ = 1,... ,N. Recalling both that Qe,iVh vanishes on c?r2^\r and the assumption (3.2), 
we can use the Poincare inequality (4.21) on and then use the triangle inequality to obtain 

^ k Llsub|l(.f Qe,(.)'^h\\ 


^ kHsuh 




(where the 1, fc-norm is over the support of Qe,eVh, which is 17^). Using (4.24) we obtain 


.^E,^(^Ii) ^ ^.^sub 


fc2 


||QE,.«'*^ii|li fc + k"^ 


2 \ 


Ik.ll 


L^iQe) 


with a > 0. Summing from £ = 1 to iV, and using the finite-overlap property (3.4), gives (4.36). 
From (4.30) we have 

Bs/{vh) k \\Qe,l'<^h\\]^2(Yi) \\'*^h\\i,2(Y^-^\\Qe,l'<^h\\]^2^Yi) 

and then using (4.23) we have 

Be,i{vh) ^ k Hsub\Qe,iVh\Hi(Cli) \\v\\^2(Y^'^ \Qe,eVh\H^{ne) 

Summing from £ = 1 to we then obtain 


N 


N 


N 


(4.39) ^ kHsuh'^^\Qe,t'>Jh\H'^(Qf') + kH^.^y^'^^\\vh\\j;^2(^Ye)\Qe,tVh\H^{Qi)- 

i=l i=l i=l 

(Note that the sums in the last inequality could be restricted to those £ with F fl 917^ 0, but this 

is not used in the following.) Using the Cauchy-Schwarz inequality, then (4.23) and finally (4.24), 
we have 

\ 1/2 / jv \ 1/2 


N 


N 




e=i 


\e=i 


N 


1/2 


(4.40) 


< 


< 


/i:££sub ll'i'/tlli_fc IIQe.^'^/iII 1 kj ! 

kHsuh ^ l|ii^lli,fe 


Inserting (4.40) into (4.39), we obtain the result (4.37). 


□ 


Our main result in the rest of this section is the following estimate from below on the field of 
values of Q^. 

Theorem 4.17 (Bound below on the field of values). There exists a constant Ci > 0 such that 

(4.41) \{vh,QeVh)i,k\ > (^ + 7") hh\\i,k ^ for all Vh€V^, 

when 

(4.42) 1iff (^1 + |) (^^) | < Ci (^1 + f 

Note that the condition on the coarse mesh diameter ilsub is more stringent than the condition on 
the subdomain diameter H ; one finds similar criteria in domain-decomposition theory for coercive 
elliptic PDFs; see, e.g., [27]. 

The following corollary restricts attention to a commonly encountered situation. 

Corollary 4.18. Suppose 5 ^ Hgub H. There exists a constant Ci > 0 such that 

(4.43) \ivh,QeVh)i.k\ > W^hWl^k , for all Vh€V^, 
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when 


kH < Cl 



Proof of Theorem 4 ..17. By Lemma 4.14, 

N N 

\{vh,QeVh)l,k\ > \\Qe,lVh\\\^^, - {De,t{vh) + Bs,i{vh)) 

t=0 £=0 

Then, using the bounds (4.33) and (4.38) we have 


N 


Qe'^h)l.k\ ^ y^ IIly/; 


e=o 




2\ l/2^ 


2 \ 


II Qe,0'^/i II 


l,fe 


(^kUsuh) 




2 \ ^ 


X] \\Qe,lVh\\\, 


^=1 


2 \ 


\W\\lk 


for a, a' > 0. Therefore, there exist Ci,C 2 >0 (sufficiently small) such that 


(4.44) 
and 

(4.45) 

ensure that 


{kH) 


1/2 


p^(l+2a)/2/ /i,2\1/2N 


-V 

\e\) 


1 + (feiL)^/2 


< Cl, 


(/uiLgub) 


2 \ “ 


< C 2 


^ /jL2\- 

\{vh,QeVh)l.k\ > y^ \\Qe,eVh\\l fc - jkHsnh) ( -pT ) 
(=0 ’ 


(4.46) 


- (A:iL)C2 


2\ 1/2 


1 + (A:iL)C2 


Ik/^lll 


2\ 1/2N 


Since a > 0, there exists a Ci > 0 such that 

/ U 2\ ( 1 + 2 q ;)/2 ^ 

(4.47) (/ci7)C2 j < c'l 

ensures that (4.44) holds; i.e. (4.46) holds under (4.47) and (4.45). 

Using in (4.46) the bound in Lemma 4.5, we obtain 

\{vH.QeVK)lA ^ (l + f) IKII 



Therefore, there exist (73,(74 > 0 (sufficiently small) so that the conditions 

/px(i-2«)/2/ /i.2\C2\ , 

(4.48) (fci7)i/2 h+ (fciy)i/2('_j j<C3(^l + 

and 

(4.49) < G(l + f)"(L)’, 

together with (4.46) and (4.47), ensure that the result (4.41) holds. 


tE! 
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Now, condition (4.48) can be rewritten as 


H 




-1 


2-0! 


Now, since e < 


H 


-1 


1 + -I ipi f.i 


for any /3 > 0. Therefore, if a < 2, then there exists a Cs > 0 such that the condition (4.48) is 
ensured by the condition 




< C3 1 + 


H 


-1 


2-ot 


i.e. 


(4.50) 




2\ (5-2o)/2 


< ^3 ( 1 + ? 


-1 


In summary (from (4.47), (4.45), (4.50), and (4.49)) we have shown that there exist Ci, (72, C^, > 

0 such that the required result (4.41), holds if the following four conditions hold: 

r2 \ (l+2a)/2 


(4.51) 

(4.52) 

(4.53) 
and 

(4.54) 


(A:i7)i/2 


< Cl, 


(fc77sub) ( o ) ^ <^2, 


(fci7)C2 


{kHsuh) 


2\ (5-2a)/2 


1 + 


H 


<C3, 


2-a' 


1 4—r 1 ^ C 4 , 


where 0 < a < 2 and a' > 0. 

The optimal choice of a to balance the exponents in (4.51) and (4.53) (ignoring the factor 
(1 + H/6)) is a = 1, and the optimal choice of a' to balance the exponents in (4.52) and (4.54) 
(again ignoring (1 + H/6)) is a' = 1. With these values of a and a', the four conditions above are 
ensured by the condition (4.42). □ 


Remark 4.19 (One-level methods). Inspecting the proof of Theorem f.l7, we see that the bound 
from below on the field of values relies on the bound in Lemma f.S, which in turn relies on the 
second bound in Lemma f.l. In the case of the one-level method (i.e. is preconditioned with 
(3.1)^, the constant on the right-hand side of the analogue of the second bound in (4.1) does not 
^ 1 when 6 ~ H; instead it blows up as H ^ 0. This is why we do not currently have a result 
analogous to Theorem /.I? for the one-level method. 


5. Matrices and convergence oe GMRES 

In this section we interpret the results of Theorems 4.3 and 4.17 in terms of matrices and 
explain their implications for the convergence of GMRES for the Helmholtz equation. Let us 
begin by recalling the convergence theory for GMRES due originally to Elman [13] and Eisenstadt, 
Elman and Schultz [12], and used in the context of domain decomposition methods in [4]. The 
most convenient statement for our purposes is [1]. We consider any abstract linear system 

(5.1) (7x = d 

in C", where (7 is an n x n nonsingular complex matrix. Choose an initial guess , introduce the 
residual r° = d — Cyp and the usual Krylov spaces: 

/C”"(G, r°) := span{G^r° : j = 0,..., m - 1} . 
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Let {■,-)d denote the inner product on C” induced by some Hermitian positive definite matrix 
D, i.e. 

(5.2) {V,W)d ■■=W*DV 

with induced norm || • Hd, where * denotes Hermitian transpose. For m > 1, define x"* to be the 
unique element of /C™ satisfying the minimal residual property: 

||r™||z, := ||d-C'x™||i5 = min Hd-CxHi), 

x€AC”^(C,r^) 

When D = I this is just the usual GMRES algorithm, and we write || • || = || • H/, but for more 
general D it is the weighted GMRES method [19] in which case its implementation requires the 
application of the weighted Arnoldi process [29]. In §6 we give results for standard GMRES which 
corresponds to the case D = I and also for a weighted variant with respect to a certain matrix D 
defined by (5.4) below. The following theorem is a simple generalisation of the classical convergence 
result stated in [1]. 


Theorem 5.1. Suppose 0 ^ Wd{C). Then 


(5.3) 


\D 


l|r°||D 


< sin"*(/3) , where cos(/3) := 


dist(0, Wd{C)) 


IICIId 

where Wd{C) denotes the field of values (also called the numerical range of C) with respect to the 
inner product induced by D, i.e. 

Wd{C) = {(x,Cx),5 :xeC",|]x|],, = l}. 


Proof. For the “standard” case D = I the result is stated in [1]. For general H, write C = 
, d = X = U^/^x, x™ = and Then it is easy to see that 

x™ £ IC(C,r^) and it satisfies the “standard” GMRES criterion for the transformed system but in 
the Euclidean norm: 

|[?-||:=||d-Cx™|| = min_ l|d-Gx||. 

5 e/C"*(C,rO) 

Then we know that the result (5.3) holds with D = I, C = C and r™ = r™. It is then simple to 
transform this back to obtain (5.3) in the case of general D. □ 


Remark 5.2. Note that for all x £ C" with |]x|[d = 1, we have 

0 < dist(0,ITz3(C)) < |(x,C'x)i3| < ||C|[r3 

and so the second formula in (5.3) necessarily defines an angle jS in the range [0,7r/2]. Thus, for 
good GMRES convergence we aim to ensure that dist(0, Wd(C')) is bounded well away from zero 
and that IjGI]!) is as small as possible. Theorem 5.1 could therefore be viewed as a generalisation to 
the case of GMRES of the familiar condition number criterion for the convergence of the conjugate 
gradient method for positive definite systems. The result of Theorem 5.1 is stated without proof in 
[4], with a reference to [13]; however [13] is concerned only with standard GMRES in the Euclidean 
inner product. 


Remark 5.3. As we see in Theorems 5.6 and 5.8 below, the analysis of provides us with 
estimates for the norm and field of values of the preconditioned matrix in the weighted norm 
induced by the real symmetric positive matrix defined in (5.4) below. Other analyses of domain 
decomposition methods for non self-adjoint or non-positive definite PDEs (e.g. [5], [36]^ have 

arrived at analogous estimates in weighted norms, although the weights appearing in these previous 
analyses are different, being associated with either the standard P[^ norm or semi-norm, and not 
the k- weighted energy norm, appropriate for Helmholtz problems, used here. 


We now use the theory in §4 to obtain results about the iterative solution of the linear systems 
arising from the Helmholtz equation. We start by interpreting the operators Qe,^ defined in (4.10) 
in terms of matrices. 


Theorem 5.4. Let Vh = ^ Then 

(^) Qe,iVh = V (rJA;IRiA,v) , £ = 1,...,N , 
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( m ) Qe,oVh = (^0 , 

pel" 

with ,i = 0,..., N defined in (3.1) and (3.6). 

Proof. These results are similar to those for symmetric elliptic problems found for example in [42] , 
so we will be brief. For (i), let £ G {1, ■ • ■ let Wh,i and yh,i be arbitrary elements of Ve, and 
denote their coefficient vectors W and Y (with nodal values on all of X^). Then W = Rjw and 
Y = Rjy, where w,y have nodal values onl^(n^). The definitions of and A^j, (2.4) and (3.1), 
then imply that aeiyh.t^Wh.t) = 'W* AfiY = w* A^^iy. So if y := Af\RiAfiV for some V G C", we 
have 

ae{yh,uwh,d = ^*ReA,V = {rJw)*A,V = W*A,V = a,{vh,WH,e), 

where yh,i is the finite element function with nodal values y. Thus, by definition of Qe,i, we have 
yh,e = Qe,i'>Jh, which implies the result (i). The proof of (ii) is similar. □ 


The main results of the previous section - Theorems 4.3 and 4.17 - give estimates for the norm 
and the field of values of the operator Qe on the space V^, with respect to the inner product (•, ■)i,k 
and its associated norm. In the following we translate these results into norm and field of values 
estimates for the preconditioned matrix Bfi^gAg in the weighted inner product (•, •)d^, where the 
weight matrix is : 

(5.4) Dk:=S + k‘^M, 

and S and M are defined in (2.5). In fact is the matrix representing the (•, •)i^fe inner product 
on the finite element space in the sense that if Vh,Wh G with coefficient vectors V, W then 

(5.5) {vh,wh)i,k = (V,W),3,. 

Theorem 5.5. Let Vh G V^. Then 

(*) {vh,QeVh)l,k = T Dk 

{^^) WQeVhh.k = WBfiXsAenn, 


Proof. For arbitrary Wh,Vh G V^, with coefficient vectors W and V, using Theorem 5.4, we have 
{wh,Qe,eVh)i,k = {W,RjAflReA,V}D„ £ = 0,...,N. 

Summing these over i = 0,..., N and using (3.1), (3.6), and (3.7), we obtain 

{Wh, Qe'l^h)l,k — , 

from which (i) and (ii) follow immediately. □ 


The following main result now follows from Theorems 4.3, 4.17, and 5.5. 

Theorem 5.6 (Main result for left preconditioning). 

(*) ll'®e,.4sA||l3fc for all H,Hsub- 

Furthermore, there exists a constant Ci such that 

\{V,B;}gA,V)D,\ > (l + f) (^) lYIlk^ forallVGC 

when 

(5.6) max | iff [l + |]) [G]) | < Ci [l + |]) [M] , 

Combining Theorem 5.1 and Theorem 5.6 we obtain: 
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Corollary 5.7 (GMRES convergence for left preconditioning). Consider the weighted GMRES 
method where the residual is minimised in the norm induced hy Dk (see, e.g., [29]J. Let r™ denote 
the mth iterate of GMRES applied to the system A^, left preconditioned with Then 



provided condition (5.6) holds. 


As a particular example of Corollary 5.7 we see that, provided |e| ^ , E[,Hsub ^ k~^ and 

6 ^ E[, then GMRES will converge with the number of iterations independent of all parameters. 
This property is illustrated in the numerical experiments iir the next section, all of which concern 
the case 6 ^ i7sub H. These experiments also explore the sharpnesss of the result (5.7) in the 
two cases: (i) |e| decreases below for fixed El and (ii) El increases above for fixed e. Our 
experiments show that there may be room to improve the theoretical results. 

While Corollary 5.7 provides rigorous estimates only for weighted GMRES, we see in §6 that 
there is, in fact, very little difference between the results with weighted GMRES and standard 
GMRES, so most of our experiments are for standard GMRES. The difficulty of proving results 
about standard GMRES in the context of domain decomposition for non self-adjoint problems was 
previously investigated by other researchers; see, e.g., [5]. 

In the next section we also explore the use of B~\g as a preconditioner for A. A particularly 
effective preconditioner is obtained with H ^ k~^ and e ^ k. However this preconditioner has a 
complexity dominated by the cost of inverting the coarse mesh problem. A multilevel variant where 
the coarse problem is approximated by an inner GMRES iteration (within the EGMRES format) 
is also proposed and is demonstrated to be very efficient for solving finite element approximations 
of the Helmholtz equation with h ^ 

Some of our experiments below use right preconditioning rather than left preconditioning. Nev¬ 
ertheless, using the coerciveness for the adjoint form in Corollary 2.6, we can obtain the following 
result about right preconditioning, however in the inner product induced by Erom this, the 

analogue of Corollary 5.7, with Dk replaced by D'^^, follows. 

Theorem 5.8 (Main result for right preconditioning). With the same notation as in Theorem 
4-17, we have 

{i) < (|e[) for all H,Hsub- 

Eurthermore, provided condition (5.6) holds, 

(**) |(V,A,H-i 5 V)^-i| > (l + f) (^) llVf^-i, forallVGC-. 

Proof. To simplify the notation, we write B~^ instead of B~\g. An easy calculation shows that 
for all V € C" and with W = , we have 

KV,A,i?,-iV)^-i| ^ I((i?;)-1A:W,W),,J ^ |(W,(H*)-ia;W),,J 
(V,V)^-. (W,W),,, (W,W),,, 

where A*, (i?*)“^ are the Hermitian transposes of A,Bj^ respectively. The coercivity of the 
adjoint form proved in Corollary 2.6 then ensures that the estimate in Theorem 5.6 (ii) also holds 
for the adjoint matrix and the result (ii) then follows. The result (i) is obtained analogously from 
taking the adjoint and using Theorem 5.6 (i). □ 

Corollary 5.9 (GMRES convergence for right preconditioning). Under the same assumptions, 
the result of Corollary 5 .7 still holds when left preconditioning is replaced by right preconditioning. 

Remark 5.10 (The truncated sound-soft scattering problem). We now outline how the results can 
be adapted to hold for the truncated sound-soft scattering problem. By this, we mean the exterior, 
homogeneous Dirichlet problem, with the radiation condition imposed as an impedance boundary 
condition on a far-field boundary. That is, 

(5.8a) —Alt — {k^ ie)u = f in Tl, 






(5.8b) 

(5.8c) 
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— -irju = g on d^ln, 

u = 0 on dflo, 

where flo is the scatterer and flu is a bounded Lipschitz domain with fin C. flR. With / = 0 and 
an appropriate choice of g, the solution of the above problem is a well-known approximation to the 
sound-soft scattering problem (see, e.g., [21, Problem 2.4] for more details). 

The variational formulation of this problem is almost identical to that of the interior impedance 
problem in ^2, except now the Hilbert space is {v € H^(fl) : v = 0 on 912 _d} cind the integrals 
over r in (2.2) and (2.3) are over dflR. The essential Dirichlet boundary condition means that 
the nodes on dflo are no longer freedoms. Thus, the domain decomposition technique for this 
problem are almost the same as those described in detail above, except that the subdomains will 
have Dirichlet conditions not only at interior boundaries, but also on any part of their boundary 
that intersects with dflo- The rest of the results in go through as before, and therefore 

analogues of Theorems 5.6 and 5.8 and Corollaries 5.7 and 5.9 hold for the truncated problem. 

6 . Numerical Experiments 

Our numerical experiments discuss the solution of (2.4) on the unit square, with rj = k, discre- 
tised by the continuous linear finite element method on a uniform triangular mesh of diameter h. 
The problem to be solved is thus specified by the choice of h and e which we denote by 

( 6 . 1 ) 

^prob and Cprob • 

We will discuss the case Cprob > 0 in Experiment 1 (with results in Tables 1, 2, 3); the empirical 
observations from these results are then used to motivate a preconditioner for the pure Helmholtz 
problem (Cprob = 0) in Experiments 2 and 3 (with results in Tables 4, 5, 6, 7). 

We will be focused on solving systems with hprob k~^l'^ (the discretisation level generally 
believed to remove the pollution effect; see, e.g., the literature reviews in [21, Remark 4.2] and [24, 
§1.2.2]), however the case hprob ~ fc (a fixed number of grid points per wavelength) appears as a 
relevant subproblem when we construct multilevel methods in Experiment 3 below. 

In the general theory given in §3, coarse grid size H and subdomain size Hgub are unrelated, 
but in our experiments here we construct local subdomains by taking each of the elements of the 
coarse grid and extending them to obtain an overlapping cover with overlap parameter 6. This is 
chosen as large as possible, but with the restriction no two extended subdomains can touch unless 
they came from touching elements of the original coarse grid. In this scenario 6 ^ H (generous 
overlap), iJsub ~ H and our preconditioners are thus determined by choices of H and e, which we 
denote by 

(6.2) TIprec and ^prec ■ 

In our preconditioners the coarse grid problem is of size ^ T{~)(,^ and there are ~ local 

problems of size (i7prec/hprob)^- If there were no overlap, the method would be “perfectly load 
balanced” (i.e. local problems of the same size as the coarse problem) when ifprec = Iiprob- Thus, 
for load balancing, 

(6.3) iJprec ^ when hprob ^ k~^/‘^. 

(Because of overlap, the local problems are larger than estimated in (6.3), and the method is in fact 
loadbalanced for somewhat finer coarse meshes than those predicted in (6.3). We will investigate 
both cases when Eprob and ffprec are equal and cases when Eprec > £prob = 0. The question of 
greatest practical interest is: if Eprob = 0, how to choose Eprec and i7prec in order to maximise the 
efficiency of the preconditioner? This question is addressed towards the end of these experiments, 
but first we illustrate the theoretical results in §5 which are about the case Eprec = £prob 0. 

The first preconditioner considered is the Classical Additive Schwarz (AS) preconditioner defined 
in (3.7). We will also be interested in variants of this that replace the local component (3.1) with 
something else. The first variant involves averaging in the overlap of the subdomains. For each 
fine grid node Xj {j G T^), let Lj denote the number of subdomains which contain Xj. Then the 
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local operator is: 

^ , for each j G , 

^ i\ Xj ^ 

and the corresponding Averaged Additive Schwarz (AVE) preconditioner is: 

(6-4) ^e,ave = ^o^e,o^o + 'B^,AyE,;oco;- 

The second variant is the Restrictive Additive Schwarz (RAS) preconditioner, which is well-known 
in the literature [3], [31]. Here to define the local operator, for each j G choose a single i = 
with the property that Xj G . Then the action of the local contribution, for each vector of fine 
grid freedoms v, is: 

(6-5) iB~RAS,iocai^)3 = {bJu)Kaj)^ , for each j G J'' , 

and the RAS preconditioner is 

(6-6) -®£,RAS = BqA^ IRo + ' 

All three of these variants of Additive Schwarz can be used in a hybrid way. This means that 
instead of doing all the local and coarse grid problems independently (and thus potentially in 
parallel), we first do a coarse solve and then perform the local solves on the residual of the coarse 
solve. This was first introduced in [32] . As described in [25] , this is closely related to the deflation 
method [33], which has been used recently to good effect in the context of shifted Laplacian 
combined with multigrid [38]. We will show results for the Hybrid RAS (HRAS) preconditioner 
which takes the form 

(6-7) •= BqA^ IRq + Pq RAS,local) Bq , 

where 

To = I~A^RqAq^Rq . 

All our results are obtained with GMRES without restarts, with the implementation done in 
python. The results in Experiment 1 are obtained both with left preconditioning and with right 
preconditioning (flexible GMRES); the relevant python codes are scipy. sparse. linalg.gmres 
and pyamg.krylov.fgmres respectively. The other two experiments are done with right precondi¬ 
tioning. In all experiments we use standard GMRES, which minimises the residual in the standard 
Euclidean inner product. However, motivated by Theorem 5.6 and Gorollary 5.7, Experiment 1 
also discusses the results of applying left preconditioned GMRES in the inner product induced by 
the matrix Dk- For this we use the algorithm described in [29], editing an existing GMRES code 
to work with the inner product induced by Dk- In all experiments the starting guess is zero and 
the residual reduction tolerance is set at I0“®. 

6.1. Experiment 1. Here we solve (2.4) with f = 1, hp^h = ,with various choices of 

£prob = e^prec and iTprec- We use triangular coarse grids. The results in Section 5 tell us that, 
provided 

^prob — ^prob ^ k and kHpj-ec CO 1, 

then the number of GMRES iterations with the preconditioner AS will remain bounded as fc —>• oo. 
Our first set of results are for the regular (Euclidean inner product) GMRES algorithm. Tables 1, 
2 and 3 give results for a range of ffprob = e^prec and Tfprec, assuming that Tfprec = for different 
choices of a. The number of iterations of the Classical Additive Schwarz method is denoted by 
while the number of iterations for the variants using averaging, RAS and Hybrid RAS are 
denoted #ave, #ras and #hras- 

In Table 1, the results for a = 1 conhrm the result of Corollaries 5.7 and 5.9. The other parts 
of this table show that in fact when Eprob = £prec = k'^ then the iteration counts remain bounded 
as k increases for a range of TIprec chosen to decrease more slowly with k than the theoretical 
requirement of 0{k~^). Thus, if there is enough absorption, the preconditioner still works well for 
solving the shifted system, even for much coarser coarse meshes than those predicted by Corollaries 
5.7 and 5.9. The case iJprec = ® is close to being load balanced (see (6.3) and the remarks 

following). To give an idea of the sizes of the systems involved, when iJprec = and k = 120 
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Left preconditioning 

Right preconditioning 



a 

= 1 



a 

= 1 


k 

i^AS 

if AVE 

4^ RAS 

if HRAS 

if AS 

if AVE 

if RAS 

if HRAS 

10 

21 

15 

15 

8 

21 

15 

15 

8 

20 

20 

15 

15 

8 

19 

15 

15 

8 

40 

21 

16 

16 

9 

19 

16 

15 

8 

60 

21 

16 

16 

9 

19 

16 

15 

8 

80 

26 

18 

16 

9 

23 

17 

15 

8 

100 

21 

17 

16 

9 

19 

16 

15 

8 

a = 0.9 


a 

= 0.9 


k 

#AS 

if AVE 

44 RAS 

if HRAS 

if AS 

if AVE 

if RAS 

if HRAS 

10 

19 

15 

15 

8 

19 

15 

15 

8 

20 

23 

18 

18 

9 

21 

18 

17 

8 

40 

27 

21 

19 

10 

24 

19 

17 

9 

60 

25 

20 

20 

10 

21 

20 

18 

9 

80 

25 

21 

20 

10 

21 

20 

18 

9 

100 

25 

21 

20 

10 

21 

20 

18 

9 

a = 0 

.8 


a 

= 0.8 


k 

#AS 

if AVE 

44 RAS 

if HRAS 

if AS 

if AVE 

44ras 

if HRAS 

10 

19 

15 

14 

8 

18 

15 

14 

8 

20 

21 

18 

17 

9 

20 

18 

17 

9 

40 

23 

22 

19 

10 

20 

20 

17 

10 

60 

21 

20 

19 

11 

18 

19 

17 

10 

80 

21 

20 

19 

11 

18 

19 

17 

10 

100 

22 

23 

19 

11 

18 

20 

17 

10 


Table 1. Number of iterations for various preconditioners with /ipj-ob = k 

Cprob — Cprec — k , i^prec — k 


the size of the fine grid problem is n = 1782225 while the size of the coarse grid problem is 2116 
and there are 2025 local problems of maximal size 3364 to be solved. We also note the overall 
improvement as we compare different preconditioners in the sequence AS, AVE, RAS, HRAS. 

Then Tables 2 and 3 repeat the same experiments for the cases £prob = Cprec = k and Eprob = 
Eprec = 1- We observe here that when iAprec = k~^, both methods continue to work quite well 
(although the number of iterations does grow mildly with fc), however for coarser coarse meshes 
Hprec = k~°‘ with a < — 1, the method quickly becomes unusable. The general superiority of HRAS 
over the other methods is striking. A * in the tables indicates that the number of iterations was 
above 200. 

To hnish Experiment 1, we repeated the experiments above with left preconditioning but where 
the GMRES algorithm minimises the residual in the norm induced by Dk- The resulting iteration 
counts were almost identical to those given in Tables 1, 2 and 3, so we do not give them here. 

Note that the results in Table 2, especially the columns corresponding to a = 1 and HRAS, 
show that is a good (although admittedly not perfect) preconditioner for Ak- Based on 
[21] this strongly suggests will be a good preconditioner for A (recall properties (i) and (ii) 
in the introduction); we see that this is indeed the case in the next experiment which is about 
preconditioners for A. 

While Experiment 1 illustrates very well our theoretical results about preconditioning the prob¬ 
lem with absorption (i.e. the matrix A^), the ultimate goal of our work is to determine the best 
preconditioner for the problem without absorption (i.e. the matrix A). Therefore, the rest of our 
experiments focus on investigating this question, and so from now on we take £prob = 0. Also, in 
the experiments above, HRAS outperformed all the other preconditioners and so in the rest of our 
experiments we restrict attention to HRAS. 

Moreover, remembering that the local solves in are solutions of local problems with 

a Dirichlet condition on interior boundaries of subdomains, and noting that these are not expected 
to perform well for genuine wave propagation (i.e. e small), we also consider the use of impedance 
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Left preconditioning 

Right preconditioning 



a 

= 1 



a 

= 1 


k 

#AS 

#AVE 

^RkS 

#HRAS 

#AS 

#AVE 

#RAS 

#HRAS 

10 

25 

16 

16 

10 

25 

17 

16 

9 

20 

25 

20 

20 

11 

24 

20 

20 

11 

40 

35 

30 

31 

16 

32 

28 

28 

14 

60 

46 

41 

42 

22 

40 

38 

37 

19 

80 

67 

56 

55 

30 

56 

50 

48 

24 

100 

75 

69 

70 

38 

64 

63 

61 

31 

a = 0.9 


a = 

= 0.9 


k 

#AS 

#AVE 

^RkS 

#HRAS 

#AS 

#AVE 

^RkS 

#HRAS 

10 

22 

18 

18 

10 

22 

18 

18 

10 

20 

37 

27 

28 

14 

35 

27 

27 

13 

40 

118 

45 

85 

24 

107 

42 

83 

21 

60 


171 

192 

40 

* 

175 

187 

35 

80 

* 

* 

* 

61 

* 

* 

* 

54 

100 

* 

* 

* 

97 

* 


* 

86 

a = 0.8 


a = 

= 0.8 


k 

#AS 

#AVE 

#RAS 

#HRAS 

#AS 

#AVE 

#RAS 

#HRAS 

10 

23 

18 

19 

12 

23 

18 

18 

12 

20 

40 

33 

35 

18 

37 

33 

34 

17 

40 

173 

140 

190 

122 

153 

130 

177 

116 

60 

* 

* 

* 

* 

* 

* 

* 

* 

80 

* 

* 

* 

* 

* 

* 

* 

* 

100 

* 

* 

* 

* 

* 

* 

* 

* 


Table 2. Number of iterations for various preconditioners with hprob 

^prob — £^prec — -^prec — ^ 


fc-3/2, 


boundary conditions on the local solves. We therefore introduce the sesquilinear form local to the 
subdomain defined as the following local equivalent of (2.2): 

ae,imp,e{v,w) = / Vu.VW — + ze) / vw — ik / uuJ, 

J Of J Of J 50f 

(remember that we are choosing rj = k in all the experiments). We let Agjmp,e be the stiffness 
matrix arising from this form, i.e. 

{A^ j^p j>^j jf = (le,l7np,E{,4^j'7 ; jij €T(n^). 

This can be used as a local operator in any of the preconditioners introduced above. For example 
if it is inserted into the HRAS operator (6.7), then the one-level variant is 

(6-8) iKinip,RAS,iocai^)j = {RJu)KjrnpAj)^^U)^)^ ^ for each jel^. 

Here (noting the distinction with (3.1)), R( denotes the restriction operator {Re)j,j' = Sjj', (as 
before) j' ranges over all but now j runs over all indices such that Xj G H^\r. The hybrid 
two-level variant is 

(6-9) Be}mp,HRAS '= ^0 ^e,0^0 + Pq (^Pe,Imp,RAS,local) Pq ■ 

We refer to these as the one- and two-level ImpHRAS preconditioners. 


6.2. Experiment 2. In Tables 4 and 5 below, we illustrate the performance of these precondition¬ 
ers with various choices of Eprec when solving the problem (2.4) with Eprob = 0 and hp^oh = k~^l'^. 
Here we use rectangular coarse grids and subproblems and employ right (FGMRES) precondition¬ 
ing (although the performance with triangular grids and left preconditioning is similar). In order 
to ensure our problem (2.4) has physical significance, we choose the data f,g in (2.3) so that the 
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Right preconditioning 


a 

= 1 


#AS 

#AVE 

#RAS 

#HRAS 

26 

17 

17 

10 

24 

21 

20 

11 

33 

30 

30 

15 

43 

41 

40 

20 

62 

56 

53 

27 

71 

70 

68 

34 


a = 

= 0.9 


#AS 

#AVE 

t^^ras 

?7^HRAS 

22 

18 

18 

10 

37 

28 

28 

13 

121 

45 

93 

23 

* 


* 

39 

* 

* 

* 

62 

* 

* 

* 

101 


a = 

= 0.8 


#AS 

#AVE 

#RAS 

#HRAS 

23 

20 

19 

13 

40 

35 

36 

17 

187 

186 

* 

181 

* 

* 

* 

* 

* 

* 

* 

* 

* 

* 

* 

* 



Left preconditioning 



a 

= 1 


k 

#AS 

#AVE 

#RAS 

#HRAS 

10 

26 

17 

17 

10 

20 

25 

21 

21 

12 

40 

37 

32 

32 

17 

60 

49 

44 

45 

24 

80 

74 

63 

61 

33 

100 

83 

78 

79 

43 

a = 0.9 

k 

#AS 

#AVE 

#RAS 

#HRAS 

10 

23 

18 

18 

10 

20 

38 

28 

30 

14 

40 

134 

49 

96 

26 

60 

* 

* 


44 

80 

* 

* 

* 

71 

100 

* 

* 

* 

115 

a = 0.8 

k 

#AS 

#AVE 

#RAS 

#HRAS 

10 

23 

20 

19 

13 

20 

42 

35 

39 

19 

40 

* 

194 

* 

182 

60 

* 

* 

* 

* 

80 

* 

* 

* 

* 

100 

* 

* 

* 

* 


Table 3. Number of iterations for various preconditioners with hprob = k 

Cprob — ^prec — 1; -^prec — k 


exact solution of problem (2.1) is a plane wave u{x) = exp{ikx.d) where d = {11^/2, l/\/2)'^. In 
Tables 4 and 5 respectively, iteration counts for the two level versions of HRAS and ImpHRAS are 
given, with the counts for the corresponding one level method given as subscripts. From these 
tables we make the following observations: 

(1) When i/prec = k~^^ the two-level versions of both HRAS and ImpHRAS perform quite well, 
although the number of iterations does grow mildly with k. The corresponding one-level 
versions perform poorly, showing that the coarse grid operator is doing a good job in this 
scenario. The choice of Cprec has minimal effect (except that it appears that ffprec should 
not be chosen much bigger than . 

(2) When iAprec = and a < 1, HRAS becomes unusable. ImpHRAS also degrades as a 
decreases but then starts to improve again, and at a = 0.6 provides a reasonably efficient 
solver with very slow growth of iterations with k. Here the two-grid variant is not much 
better than the one-grid variant, due to the fact that the coarse grid problem has become 
very coarse (when k = 80, n = 511225, the size of the coarse grid problem is 196, and the 
size of each of the largest local problem is 10404. 

At the end of Experiment 1, we argued that the fact that was a good preconditioner for Ak, 
along with the results of [21], suggested that would be a good preconditioner for A. Having 
performed Experiment 2, we can now compare preconditioning Ak with B^^ to preconditioning A 
with B^^. Relevant results are in the column in Table 2 for right-preconditioning with HRAS and 
a = 1 and the column in Table 4 with a = 1 and /3 = 1. Although the iteration counts are slightly 
different, a linear least-squares fit shows that the rate of growth with k is very similar in each case 
(^0.60 versus which is in line with the intuition above. 

6.3. Experiment 3. From observations above, we can identify a possible multilevel strategy for 
preconditioning the problem with hprob = k~^/^ and £prob = 0. We do this only in 2D using the 
experiments above, but it is possible to carry out a similar analysis in 3D. 
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a 

= 1 




k\f3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 

10 

IO34 

IO34 

1134 

1134 

1234 

1533 

1934 

20 

1292 

1292 

1292 

1292 

1392 

1092 

3793 

40 

18* 

18* 

18* 

18* 

18* 

25* 

63* 

60 

25* 

25* 

25* 

25* 

25* 

32* 

86* 

80 

34* 

34* 

33* 

33* 

32* 

39* 

no* 

100 

45* 

45* 

44* 

43* 

42* 

47* 

136* 




a - 

= 0.8 




k\f3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 

10 

I625 

I624 

1624 

I624 

1724 

I824 

1926 

20 

2859 

2859 

2859 

2859 

2459 

3O58 

3961 

40 

** 

** 

175* 

139* 

96* 

65i6o 

76i32 

60 


** 



169* 

114* 

119i97 

80 



** 

** 


166* 

165* 

100 




** 



** 




a - 

= 0.6 




k\f3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 

10 

I618 

I618 

1618 

I618 

I618 

16i9 

1922 

20 

5572 

55n 

5367 

5163 

4858 

3946 

3943 

40 

140i24 

138129 

131135 

125i33 

114i25 

8694 

8I76 

60 




** 


147i4i 

113102 

80 




** 


178i6o 

135i2i 

100 




** 





Table 4. Number of iterations for HRAS with Sprob = 0, Eprec = , -ffprec = k 

and right preconditioning. 


The first step is to use a two level HRAS with, say, Eprec = k and iJprec = 0(k~^). Denote 
the number of iterations of this method by Ii{k). (A least squares linear fit of the data for the 
two-level method in Column 5 of the first pane of Table 4, indicates that Ii{k) ~ 0{k'^'^).) Each 
application of the preconditioner requires the solution of one system of size = 0{k'^) and 

Oik^) systems of size (i?prec/^prob)^ ~ 0{k). Each matrix-vector multiplication with the system 
matrix needs 0(k^) operations. The total cost is then approximately: 


( 6 . 10 ) 


Ji(fc) 


C(fc2) +0(k'^)C{k)+ 0{k^) 

Coarse solve local solves matrix—vector multiplications 


where C (m) denotes an estimate of the cost of backsolving with a factorized m x m finite element 
system in 2D (the theoretical upper bound is C(m) ~ (Only backsolves need be counted 

since these appear in every iteration while the factorization needs only be done once.) 

The local solves in (6.10) can in principle be done in parallel, so that the main bottleneck is 
likely to be the (relatively large) coarse solve. For this reason we consider replacing the direct 
coarse solve with an inner iteration within an FGMRES set-up. Thus we need an efficient iterative 
method for the coarse problem, which itself is a Helmholtz finite element system with ft-prob ~ k~^, 
£prob = k. Table 6 gives some experiments with preconditioned iterative methods for problems of 
this form in the two cases /iprob = 7r/10fc and /iprob = 7r/5fc (20 and 10 grid points per wavelength 
respectively), using ImpHRAS with Eprec = k and i?prec = k~^^'^. Iteration counts are given for the 
two-level variant (and the one-level variant as subscripts). These show that the one-level method 
works just as well as (sometimes even better than) the two level method. With the number of 
iterations denoted by l 2 {k), a least-squares linear fit to the one-level data for hprob = 7r/5fc yields 
the estimate l 2 {k) ^ k'^-^. 

If we use this method (in its one level form) to approximate the coarse solve in (6.10), then each 
application of the preconditioner requires the solution of 0{k) systems of size 0{k) and the total 
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a 

= 1 




k\f 3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 

10 

1545 

1545 

1545 

1546 

1546 

1747 

2I49 

20 

17io4 

17io4 

17105 

17io5 

I8105 

20107 

34ii3 

40 

21* 

21, 

21, 

21, 

21, 

26, 

56, 

60 

27, 

27, 

27, 

27, 

27, 

33, 

78, 

80 

36, 

36, 

35, 

35, 

34, 

40, 

101, 

100 

47, 

47, 

46, 

45, 

43, 

48, 

123, 




a - 

= 0.8 




k\f 3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 

10 

1527 

1527 

1527 

1527 

1527 

I628 

1929 

20 

2356 

2356 

2356 

2356 

2356 

2557 

3462 

40 

51io6 

51io6 

51106 

51io6 

49io6 

48io8 

63ii6 

60 

107i5o 

106i5o 

105i5o 

104i5o 

99i5o 

85i5i 

99i64 

80 

187i94 

185i93 

183i93 

178i93 

I68193 

132i94 

138, 

100 



*=(= 

** 


185, 

179, 




a - 

= 0.6 




k\f 3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 

10 

14i8 

14i8 

14i8 

14i8 

14i9 

1520 

1723 

20 

2731 

2731 

2631 

2631 

2632 

2833 

3642 

40 

5151 

5151 

5051 

5O51 

4851 

5O51 

7366 

60 

72ri 

7171 

7071 

6971 

6970 

7067 

10491 

80 

7485 

7485 

7485 

7484 

7483 

7777 

126111 

100 

84g8 

84g8 

8498 

8497 

84g5 

8637 

148i3i 


Table 5. Number of iterations for ImpHRAS with Eprob = 0, Cprec = , -ffprec = 

k~°‘, and right preconditioning. 


k 

hproh = 7r/5fc 

^prob — ir/lOk 

10 

9io 

9 g 

20 

14i5 

14i5 

40 

2 I 24 

2224 

60 

3O32 

3132 

80 

3535 

3735 

100 

3938 

4139 

120 

4240 

4543 

140 

4643 

4946 


Table 6. Number of iterations for ImpHRAS with Eprob = k = Eprec, -ffprec = k ° 
and right preconditioning. 


cost of the solution is about hik) {kC{k) +0{k‘^)'). Using this to replace the first term on the 
right-hand side of (6.10), the cost of the resulting inner-outer algorithm would be approximately 

(6.11) h(k) [hik) {kC{k) + e) + 0{e)C{k) + 0(fc3)] . 

Now it is well-known that in 2D fast direct solvers for finite element systems of size k perform 
with 0{k) complexity for k < 10^. So for practically relevant wavenumbers we expect a complexity 
for the inner-outer algorithm of the form Ii{k) [fc^/ 2 (fc) -|- C>(fc^)] . 

In Table 7, we illustrate the performance of this inner-outer method (which we denote lO-ImpHRAS) 
implemented within the FGMRES framework. The outer tolerance is 10“® (as before, and the in¬ 
ner tolerance r is as indicated). Below each iteration count we present also the total running time 
of our reference NumPy implementation; this includes the setup time of all (sub)matrices. We also 
give an average time for each outer iteration. 
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T 

= 0.01 




k\l3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 


15 { 5 ) 

15 ( 5 ) 

15 ( 5 ) 

15 ( 5 ) 

15 ( 5 ) 

17 ( 4 ) 

21 ( 3 ) 


0.69 [0.02] 

0.75 [0.02] 

0.73 [0.03] 

0.77 [0.03] 

0.74 [0.03] 

0.75 [0.02] 

0.82 [0.02] 

20 

17 ( 7 ) 

17 ( 7 ) 

17 ( 7 ) 

17 ( 6 ) 

18 ( 6 ) 

20 ( 4 ) 

34 ( 2 ) 

4.49 [0.12] 

4.38 [0.12] 

4.34 [0.11] 

4.35 [0.11] 

4.42 [0.11] 

4.38 [0.10] 

5.18 [0.08] 


21 ( 11 ) 

21 ( 11 ) 

21 ( 11 ) 

21 ( 10 ) 

21 ( 9 ) 

26 ( 6 ) 

56 ( 2 ) 


62.9 [0.96] 

62.8 [0.96] 

63.1 [0.96] 

62.4 [0.93] 

62.1 [0.91] 

63.3 [0.81] 

82.8 [0.73] 

60 

27 ( 15 ) 

27 ( 15 ) 

27 ( 15 ) 

27 ( 14 ) 

27 ( 12 ) 

33 ( 6 ) 

78 ( 2 ) 

420.9 [4.13] 

422 [4.13] 

423 [4.10] 

421 [4.01] 

416 [3.87] 

426 [3.46] 

560 [3.21] 

80 

36 ( 17 ) 

36 ( 17 ) 

35 ( 16 ) 

35 ( 15 ) 

34 ( 12 ) 

40 ( 6 ) 

101 ( 2 ) 

1536 [11.1] 

1564 [11.1] 

1608 [10.89] 

1555 [10.5] 

1540 [10.0] 

1542 [8.91] 

2052 [8.54] 

100 

47 ( 21 ) 

47 ( 21 ) 

46 ( 20 ) 

45 ( 18 ) 

43 ( 15 ) 

48 ( 7 ) 

123 ( 2 ) 

4061 [20.9] 

4281 [21.1] 

4073 [19.6] 

3992 [20.5] 

4078 [18.4] 

3880 [17.1] 

5130 [17.5] 




T 

= 0.1 




fe\/3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 

10 

16 ( 3 ) 

16 ( 3 ) 

16 ( 3 ) 

16 ( 3 ) 

16 ( 3 ) 

17 ( 2 ) 

19 ( 2 ) 

0.60 [0.02] 

0.60 [0.02] 

0.61 [0.02] 

0.60 [0.02] 

0.59 [0.02] 

0.60 [0.02] 

0.73 [0.02] 

20 

17 ( 4 ) 

17 ( 4 ) 

17 ( 4 ) 

17 ( 4 ) 

18 ( 4 ) 

20 ( 3 ) 

34 ( 1 ) 

3.88 [0.10] 

3.84 [0.09] 

3.78 [0.09] 

3.91 [0.09] 

3.93 [0.09] 

3.98 [0.09] 

4.86 [0.08] 

40 

21 ( 7 ) 

21 ( 7 ) 

21 ( 7 ) 

21 ( 6 ) 

21 ( 6 ) 

26 ( 4 ) 

56 ( 1 ) 

56.9 [0.83] 

56.9 [0.83] 

56.8 [0.82] 

56.4 [0.81] 

56.8 [0.83] 

60.3 [0.76] 

80.6 [0.70] 

60 

27 ( 9 ) 

27 ( 9 ) 

27 ( 8 ) 

27 ( 8 ) 

27 ( 7 ) 

34 ( 4 ) 

82 ( 1 ) 

386 [3.65] 

384 [3.63] 

384 [3.58] 

382 [3.55] 

379 [3.46] 

396 [3.24] 

543 [3.13] 

80 

36 ( 10 ) 

35 ( 10 ) 

35 ( 10 ) 

35 ( 9 ) 

34 ( 8 ) 

40 ( 4 ) 

104 ( 1 ) 

1361 [9.55] 

1389 [9.51] 

1398 [9.40] 

1344 [9.34] 

1368 [9.10] 

1332 [8.49] 

1926 [8.35] 

100 

47 ( 13 ) 

46 ( 13 ) 

46 ( 12 ) 

45 ( 11 ) 

43 ( 10 ) 

48 ( 4 ) 

126 ( 1 ) 

3699 [18.6] 

3723 [19.3] 

3700 [18.1] 

3535 [17.1] 

3687 [16.7] 

3530 [16.1] 

4935 [16.9] 




T 

= 0.5 




k\l3 

0 

0.4 

0.8 

1 

1.2 

1.6 

2.0 


17 ( 2 ) 

17 ( T ) 

18 ( T ) 

18 ( 1 ) 

18 ( T ) 

19 ( T ) 

23 ( T ) 


0.61 [0.02] 

0.59 [0.02] 

0.66 [0.01] 

0.66 [0.02] 

0.65 [0.02] 

0.66 [0.01] 

0.65 [0.01] 

20 

19 ( 2 ) 

19 ( 2 ) 

19 ( 2 ) 

19 ( 2 ) 

19 ( 2 ) 

25 ( 1 ) 

36 ( 1 ) 

3.86 [0.08] 

3.72 [0.08] 

3.72 [0.08] 

3.68 [0.08] 

3.66 [0.08] 

4.00 [0.07] 

4.96 [0.07] 

40 

22 ( 4 ) 

22 ( 4 ) 

22 ( 4 ) 

22 ( 3 ) 

22 ( 3 ) 

28 ( 2 ) 

61 ( 1 ) 

54.8 [0.73] 

54.9 [0.73] 

54.8 [0.72] 

54.7 [0.71] 

54.8 [0.71] 

58.0 [0.69] 

80.4 [0.68] 

60 

28 ( 5 ) 

28 ( 5 ) 

28 ( 5 ) 

28 ( 5 ) 

28 ( 4 ) 

35 ( 2 ) 

82 ( 1 ) 

370 [3.20] 

371 [3.20] 

372 [3.19] 

370 [3.16] 

369 [3.11] 

383 [3.00] 

539 [3.10] 

80 

36 ( 6 ) 

36 ( 6 ) 

36 ( 6 ) 

36 ( 5 ) 

35 ( 5 ) 

42 ( 2 ) 

104 ( 1 ) 

1288 [8.62] 

1375 [8.69] 

1300 [8.59] 

1316 [8.51] 

1273 [8.38] 

1323 [8.08] 

1909 [8.19] 

100 

46 ( 8 ) 

46 ( 8 ) 

46 ( 7 ) 

46 ( 7 ) 

44 ( 6 ) 

49 ( 2 ) 

126 ( 1 ) 

3533 [16.5] 

3678 [16.01] 

3586 [16.4] 

3471 [15.9] 

3483 [16.2] 

3503 [15.5] 

4832 [16.4] 


Table 7. IQ — ImpHRAS with ^prob = 0 and £prec = • Bold font: number of 

outer (inner) iterations). Non-bold font: total time in seconds [with an average 
time for each outer iteration in square brackets] 


We see from the table that the commonly-used choice e = performs much worse that the 
choice oi e = for the values of /3 < 2 considered here; of these latter choices, e = k seems to be 
the best choice for this composite algorithm. The best times were obtained for a fairly large inner 
tolerance r; the results for e = A:,r = 0.5 show a growth of the total compute time with k that is 
close to 0{k'^) = Note that the runs were performed on a single CPU core; the method 

is highly parallelisable and parallel implementation results will be presented in a future paper. 

Acknowledgements. We thank Melina Freitag, Stefan Giittel, Jennifer Pestana and Andy 
Wathen for valuable advice about various aspects of weighted GMRES. We also thank Clemens 
Pechstein for valuable comments. 
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